The sequence of the spike (also called peplomer or E2) protein gene of the Mebus strain of bovine coronavirus (BCV)
was obtained from cDNA clones of genomic RNA. The gene sequence predicts a 150,825 mol wt apoprotein of 1363
amino acids having an N-terminal hydrophobic signal sequence of 17 amino acids, 19 potential N-linked glycosylation
sites, a hydrophobic anchor sequence of approximately 17 amino acids near the C terminus, and a hydrophilic
cysteine-rich C terminus of 35 amino acids. An internal Lys-Arg-Arg-Ser-Arg-Arg sequence predicts a protease cleavage
site between amino acids 768 and 769 that would separate the S apoprotein into S1 and S2 segments of 85,690 and
65,153 mol wt, respectively. Amino-terminal amino acid sequencing of the virion-derived gp100 spike subunit confirmed
the location of the predicted cleavage site and established that gp120 and gp100 are the glycosylated virion forms
of the S1 and S2 subunits, respectively. Sequence comparisons between BCV and the antigenically related mouse
hepatitis coronavirus revealed more sequence divergence in the putative knob region of the spike protein (S1) than
in the stem region (S2). © 1990 Academic Press, Inc.


The bovine coronavirus (BCV) is an important cause of neonatal
calf diarrhea (14, 26) and may also be the cause of winter
dysentery in adult cattle (30). The mechanisms by which BCV
causes disease and persistent infection are not understood, nor
are current vaccines universally regarded as effective. To address
these problems, we have begun a detailed study of the BCV protein
and genome structure.

BCV has four major structural proteins (17). These are (i) a
200-kDa spike (peplomer) glycoprotein (S) that exists on the
virion as cleaved subunits of approximately 120 and 100 kDa,
(ii) a 140-kDa glycoprotein (HE) that has both hemagglutinating
(18) and esterase (37) activities and is composed of two
identical, disulfide-linked 65-kDa subunits (10, 12, 16, 28),
(iii) a 26-kDa integral membrane glycoprotein (M) (21), and
(iv) an internal phosphorylated nucleocapsid protein (N) (21).
Of these, the S protein is presumed to be the major structure by
which coronaviruses attach to cells and initiate infection
(reviewed by Spaan et al. (34)). The HE protein, however, may
also bind to cells to initiate infection, and for BCV the
relative importance of these two proteins in initiating infection
is not known. Both S and HE are probably important in inducing
immunity, since antibodies to each are known to neutralize virus
infectivity in cell culture and in calves (8, 9). S and HE,
therefore, may both be useful in developing effective engineered
vaccines against BCV.

cDNA cloning of BCV genomic RNA was accomplished essentially as
previously described (11, 21) except that random 5-mer
oligodeoxynucleotides (Pharmacia) and 17-mer
oligodeoxynucleotides of specific sequences were used as primers
for first-strand synthesis. Clones were mapped relative to one
another and to the 3' end of the genome using a matrix spot
hybridization technique. Some clones were sequenced by the
chemical method of Maxam and Gilbert (26) and some by the
dideoxynucleotide-induced chain termination method of Sanger (31)
as described by Kraft et al. (19) using Sequenase enzyme (United
States Biochemicals). For much of the sequencing, restriction
endonuclease fragments were subcloned into the pGEM4Z vector
(Promega) and forward and reverse sequencing primers for the pGEM
vectors were used. Sequence-specific oligodeoxynucleotides were
also synthesized and used for sequencing within certain regions
of the large clones.

Fig. 1. Gene map of the BCV genome, cDNA clone positions, and strategy for sequencing the S gene.

The amino-terminal ends of purified gp120 and gp100 subunits
were subjected to sequencing by the method of Matsudaira (24).
Unlabeled BCV was purified by isopycnic sedimentation in sucrose
gradients, and the proteins were electrophoretically separated
after reduction in 2-mercaptoethanol (17) and electroblotted (13)
onto polyvinylidene difluoride membrane (24). Proteins were
visualized by staining with Coomassie brilliant blue, and the
gp120 and gp100 bands were excised and shipped to Dr. Matsudaira
for analysis.

Complete sequencing of clone MA7, which extends 4.2 kilobases
from the 3' end of the genome (Fig. 1), revealed a continuous
open reading frame located on the 5' side of the ORF for a
potential 4.9-kDa protein (Abraham et al., to be published
elsewhere). The deduced amino acid sequence of the extended ORF
demonstrated high sequence similarity to the C-terminal end of
the antigenically related MHV-A59 (22) and MHV-JHM (32) S
proteins, both antigenic homologs of the BCV S protein (13).
These data suggested that the S protein gene of BCV lies in the
same relative position on the genome as does the spike protein
gene of MHV. To complete the sequencing of the S gene, both
strands of three clones, II, HPA2, and G6, generated by random
priming, and three clones, LK5, LP6, and 29, generated by
specific priming, were sequenced (Fig. 1).

The total sequence for the putative S ORF extended to a position
7.4 kb from the 3' end of the genome and contained 4089 bases
(Fig. 2). We conclude that this ORF is the S gene since it
potentially encodes a 1363 amino acid protein of 150,825 Da, the
approximate size of the unglycosylated spike precursor (10), and
because its deduced amino acid sequence shows extensive sequence
similarity throughout with the S proteins of both strains of MHV.
Five other open reading frames ranging in size from 34 to 66
amino acids were also found within the S gene sequence in the
plus one reading frame, but their significance is not known at
this time. The putative S ORF is preceded immediately upstream
(beginning at base 12 in Fig. 2) by the consensus CYAAAC sequence
thought to play a role in leader priming of coronavirus
transcription. The sequence is also found three times within the
S ORF, beginning at positions 817, 1667, and 3776, but it is not
established that transcripts initiate at any of these sites.
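
The search for CYAAAC consensus sequences described above can be sketched as a simple pattern scan. This is a minimal illustration only, not the procedure used in this study: the function name `find_consensus` and the toy sequence are hypothetical, and the sketch assumes the standard IUPAC meaning of Y (a pyrimidine, C or T).

```python
import re

# IUPAC code Y stands for a pyrimidine (C or T), so CYAAAC expands to C[CT]AAAC.
# The lookahead reports overlapping matches too (e.g. CTAAACTAAAC).
CONSENSUS = re.compile(r"(?=C[CT]AAAC)")


def find_consensus(seq: str) -> list[int]:
    """Return the 1-based start position of every CYAAAC match in a DNA sequence."""
    return [m.start() + 1 for m in CONSENSUS.finditer(seq.upper())]


# Toy sequence for illustration only; it is NOT the BCV S gene sequence.
toy = "GGTAGATCTAAACATGTTTCCAAACGG"
print(find_consensus(toy))  # [8, 20]
```

Positions are reported 1-based to match the base numbering convention of Fig. 2. A match only locates the motif; as noted above, it does not show that transcripts initiate there.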

Five features of the deduced BCV S protein reflect the properties
of four other coronavirus spike proteins that have been
characterized to date from nucleotide sequence data (1, 2, 15,
20, 22, 27, 29, 32). (i) There is an N-terminal hydrophobic
stretch of amino acids which predicts a signal peptide with a
cleavage site between amino acids 17 and 18 (38). (ii) There are
19 potential asparagine-linked glycosylation sites that could
give rise to the only kind of glycosylation demonstrated for this
protein (Hogue and Brian, unpublished data; 10). (iii) There is a
hydrophobic stretch of 17 amino acids near the C terminus that
could serve as a stop-transfer and anchor sequence. (iv) There is
a stretch of 8 amino acids on the immediate N-terminal side of
the predicted anchor sequence (-K-W-P-W-Y-V-W-L-, beginning with
amino acid 1305) that is identical in all coronavirus S proteins
sequenced to date. (v) There is a cysteine-rich hydrophilic C
terminus of 35 amino acids that is probably the intravirion
domain. In common with MHV (22, 32) and IBV (1, 2, 20, 27), but
not with TGEV (15, 29; Tung and Brian, unpublished) and FIPV
(15), the BCV S protein also has an internal sequence of basic
amino acids that, in the case of MHV and IBV, lies on the
immediate N-terminal side of the protease cleavage site (6, 22).
In BCV the sequence is K-R-R-S-R-R beginning with amino acid 763
and, on the basis of the pattern in MHV and IBV, predicts a
cleavage between amino acids 768 and 769 (note arrow in Fig. 2).
Cleavage at this point would divide the unglycosylated S protein
into an N-terminal segment of 85,690 Da (S1) and a C-terminal
segment of 65,153 Da (S2).

Fig. 2. Nucleotide sequence of the S gene and its deduced amino acid sequence. The nucleotide sequence shown begins with the TAG
termination codon of the HE gene (underlined) 17 bases upstream of the presumed S start site (7407 bases from the poly(A) tail), and ends with
the TAA termination codon of the S protein. The first three amino acids of the putative 4.9-kDa protein are shown beginning at base position
4099. Consensus CYAAAC sequences are boxed. The presumed amino-terminal signal peptide and carboxy-terminal anchor sequences are
underlined. Potential N-linked glycosylation sites (NXS or NXT, where X ≠ P) are boxed. The proteolytic cleavage site separating S1 and S2 is
identified with an arrow. The extended sequence of amino acids missing in MHV-JHM is identified by individually underlined amino acids, and
that missing in MHV-A59, by asterisks.
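
The rule used to mark potential N-linked glycosylation sites in Fig. 2 (NXS or NXT, where X is any residue except proline) can be expressed as a short scan over a protein sequence. The following is a sketch under stated assumptions: the function name `potential_n_glyc_sites` and the toy peptide are hypothetical, and the sketch identifies candidate sequons only; it does not show that a site is actually glycosylated.

```python
import re

# N-X-S or N-X-T where X is any residue except proline. The lookahead lets
# adjacent, overlapping sequons (e.g. N-N-T-S) each be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def potential_n_glyc_sites(protein: str) -> list[int]:
    """Return the 1-based position of the asparagine in each potential sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Toy peptide for illustration only; it is NOT part of the BCV S protein.
toy = "MNGTAANPSLNNTSK"
print(potential_n_glyc_sites(toy))  # [2, 11, 12]
```

In the toy peptide, N7 is skipped because it is followed by proline, whereas N11 (N-N-T) and N12 (N-T-S) are both counted.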

From amino acid sequencing studies, no N-terminal sequence could
be obtained from the virion-derived 120-kDa subunit, possibly
because of N-terminal blockage. The N-terminal sequence of the
100-kDa subunit could be obtained, however, and was determined to
be X-I-T-T-G-Y-X-F-, identifying the first amino acids downstream
from the predicted internal cleavage site. These results
confirmed the predicted internal cleavage site and established
that the 120-kDa subunit is S1 and the 100-kDa subunit is S2.
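
The figures quoted for the precursor and its two subunits can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text; the function name `summarize` is hypothetical. Treating the difference between the apparent subunit sizes (120 and 100 kDa) and the apoprotein masses as carbohydrate is an assumption made here for illustration, since sizes estimated by gel electrophoresis are approximate.

```python
# Values quoted in the text (Da, unglycosylated apoproteins).
PRECURSOR_DA = 150_825
S1_DA = 85_690
S2_DA = 65_153

ORF_BASES = 4089
TOTAL_RESIDUES = 1363
LAST_S1_RESIDUE = 768  # cleavage falls between residues 768 and 769

# Apparent sizes of the virion subunits (kDa); approximate gel estimates.
APPARENT_KDA = {"S1": 120, "S2": 100}


def summarize() -> dict:
    """Cross-check the ORF length, subunit lengths, and subunit masses."""
    codons, remainder = divmod(ORF_BASES, 3)
    return {
        "codons": codons,
        "remainder": remainder,
        "s1_residues": LAST_S1_RESIDUE,
        "s2_residues": TOTAL_RESIDUES - LAST_S1_RESIDUE,
        # Hydrolysis of one peptide bond adds one water (about 18 Da).
        "subunit_sum_minus_precursor": S1_DA + S2_DA - PRECURSOR_DA,
        # Assumption: excess over the apoprotein mass is mostly carbohydrate.
        "s1_apparent_excess": APPARENT_KDA["S1"] * 1000 - S1_DA,
        "s2_apparent_excess": APPARENT_KDA["S2"] * 1000 - S2_DA,
    }


for key, value in summarize().items():
    print(f"{key}: {value}")
```

The 4089 bases correspond to exactly 1363 triplets, S1 and S2 span 768 and 595 residues, and the two subunit masses exceed the precursor mass by 18 Da, the mass of the water molecule added when one peptide bond is hydrolyzed.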

The BCV and MHV S proteins show remarkable sequence homology,
suggesting that these viruses are recently diverged. After
aligning sequences for maximal homology, the following points
emerge. (i) Relative to BCV, a large deletion appears in the MHV
S1 subunits. For JHM it is a contiguous gap of 138 amino acids,
and for A59 it is a discontiguous gap of 50 amino acids (Figs. 2
and 3). The function of the additional sequence in the BCV S1
subunit is not known, but it is possibly a structure that
interacts in some way with the HE glycoprotein, a structural
protein not found on MHV (13, 34) except under certain rare
conditions (33). No electron micrographic or chemical data exist,
however, to suggest that S and HE do physically interact (3, 17,
18). It is interesting to note that the entire region in the BCV
S protein corresponding to the gap region of the JHM S protein is
especially rich in cysteine residues and contains 15 (26%) of the
56 total cysteines in the BCV S protein (Figs. 2 and 3). This
suggests that this part of the molecule may be important for
intramolecular or intermolecular disulfide linkages. (ii)
Exclusive of the large gap in the MHV sequences, the S1 subunits
of JHM and A59 show 62 and 60% identity, respectively, with BCV,
and the S2 subunits show 75 and 74%, respectively. Throughout the
S protein, 41 of 56 cysteine positions and 13 of 19 potential
N-linked glycosylation sites are conserved. The internal
proteolytic cleavage position (not yet confirmed for JHM) is also
conserved. The pattern of greater amino acid sequence divergence
in the S1 subunit is consistent with the model of Cavanagh (4)
and De Groot et al. (7), which proposes that the S1 subunit
comprises the exposed bulbous structure of the spike and probably
contains most (5), but not all (23, 36), of the neutralizable
antigenic sites. It is the structure most likely to undergo
changes as a result of immunologic selective pressures.

Fig. 3. Structural comparison of the S proteins of MHV-JHM, MHV-A59, and BCV. Sequences are aligned for maximum homology. A sequence
found in BCV but not found in MHV-JHM or MHV-A59 is expressed as a gap (broken line) in the MHV sequences. Putative N-terminal signal
peptides and C-terminal anchor sequences are boxed. Vertical lines above the sequence indicate potential asparagine-linked glycosylation
positions, and below the sequence, cysteine positions. The identified (BCV, MHV-A59) and putative (MHV-JHM) proteolytic cleavage sites are
identified by arrows.
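
Percent identity "exclusive of the large gap" can be illustrated by counting matches only over alignment columns in which neither sequence has a gap. This is one common convention, assumed here for illustration; the exact procedure behind the 62, 60, 75, and 74% values is not specified beyond the exclusion of the gap. The function name `percent_identity` and the toy aligned sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gap columns, e.g. a region deleted in one sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment for illustration only; these are NOT the BCV or MHV sequences.
seq_a = "ACDEFGHIKL"
seq_b = "ACD--GHIKV"
print(percent_identity(seq_a, seq_b))  # 87.5
```

In the toy alignment, two gap columns are skipped, leaving eight compared positions of which seven match.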

Fusion of cells in culture is one biological activity associated
with cleavage of the MHV S protein (35). Despite its extensive
sequence similarity with the MHV S protein, however, the BCV S
protein shows little fusion activity. In fact, fusion is a
behavior we have not observed with the Mebus strain of BCV even
though the S protein is primarily in the cleaved form on the
virion (13, 17). It is not clear why BCV and MHV behave so
differently in their fusogenic properties, but functional
evaluation of sequence differences near the cleavage sites of
these two viruses may aid in clarifying the mechanisms of fusion
by MHV. This is especially interesting since hydrophobic regions,
common at the cleavage sites on fusion proteins of
paramyxoviruses and myxoviruses, are absent in the MHV S protein
(22), and different mechanisms of fusion may be employed.